HIVE-29616: Fix incorrect column lineage when multiple subqueries with identical table aliases #6485

Merged: deniskuzZ merged 3 commits into apache:master from ljq-dmr:HIVE-29616 on Jun 1, 2026
@ljq-dmr (Contributor) commented May 14, 2026

What changes were proposed in this pull request?

Correct the baseCols resolved for column lineage and predicates when multiple subqueries use identical table aliases.

Why are the changes needed?

ExprProcFactory#findSourceColumn resolves source columns from TopOps by matching table and field aliases, and returns as soon as it finds a match. This fails when multiple subqueries use identical table aliases (e.g., the branches of a UNION statement): because the search returns the first match it encounters, it may link to a source column from a different subquery branch, producing incorrect lineage.
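
To make the failure mode concrete, here is a minimal, self-contained sketch of a first-match alias lookup. It does not use Hive's classes: `Scan`, `firstMatch`, `scopedMatch`, the branch labels, and the table names `db.orders` / `db.refunds` are all hypothetical, and scoping the search by branch is only one assumed way to disambiguate, not necessarily what this patch does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class AliasLookupSketch {

    /** Hypothetical stand-in for a table scan registered under an alias. */
    static final class Scan {
        final String branch;       // which subquery branch the scan belongs to
        final String tableAlias;   // alias used inside that branch
        final String sourceTable;  // real table behind the alias
        final List<String> columns;

        Scan(String branch, String tableAlias, String sourceTable, List<String> columns) {
            this.branch = branch;
            this.tableAlias = tableAlias;
            this.sourceTable = sourceTable;
            this.columns = columns;
        }
    }

    /** First-match lookup: ignores the branch, so identical aliases collide. */
    static Optional<String> firstMatch(List<Scan> scans, String alias, String column) {
        for (Scan s : scans) {
            if (s.tableAlias.equals(alias) && s.columns.contains(column)) {
                return Optional.of(s.sourceTable + "." + column);
            }
        }
        return Optional.empty();
    }

    /** Assumed alternative: only consider scans from the requesting branch. */
    static Optional<String> scopedMatch(List<Scan> scans, String branch,
                                        String alias, String column) {
        for (Scan s : scans) {
            if (s.branch.equals(branch) && s.tableAlias.equals(alias)
                    && s.columns.contains(column)) {
                return Optional.of(s.sourceTable + "." + column);
            }
        }
        return Optional.empty();
    }

    /** Two UNION branches that both use the alias "t" for different tables. */
    static List<Scan> sampleScans() {
        return Arrays.asList(
            new Scan("union-branch-1", "t", "db.orders", Arrays.asList("id", "amount")),
            new Scan("union-branch-2", "t", "db.refunds", Arrays.asList("id", "amount")));
    }

    public static void main(String[] args) {
        List<Scan> scans = sampleScans();
        // Resolving t.amount on behalf of the second branch:
        System.out.println(firstMatch(scans, "t", "amount").orElse("none"));
        // prints "db.orders.amount" (wrong branch)
        System.out.println(scopedMatch(scans, "union-branch-2", "t", "amount").orElse("none"));
        // prints "db.refunds.amount"
    }
}
```

In this sketch the first-match search attributes the second branch's `t.amount` to `db.orders`, which mirrors the incorrect lineage described above; restricting the search to the requesting branch returns `db.refunds.amount`.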

Does this PR introduce any user-facing change?

No

How was this patch tested?

mvn test -Pitests -pl itests/qtest -Dtest=TestMiniLlapLocalCliDriver -Dqfile=lineage8.q

Comment thread ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/ExprProcFactory.java Outdated
@deniskuzZ (Member) left a comment

+1

@deniskuzZ merged commit 99b61cc into apache:master on Jun 1, 2026
4 checks passed